Skip to content

[TG-9002] Symex: revert guards after call return #5003

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

smowton
Copy link
Contributor

@smowton smowton commented Aug 9, 2019

Executing a function may have a cumulative effect on the state guard. For example, if the callee contained ASSUME statements that rendered one or more control-flow options unviable then the guard might still embody that restriction (i.e. for if(x) ASSUME(false) the guard might still be !x). However, on function return we know that all control-flow paths have converged or been shown to be unviable, so we can restore the simpler guard as it was when we entered the callee function.

Exceptions:
(a) if the guard is false it would be correct but inefficient to restore it; keep it false until we find a convergeance with another viable path
(b) if we're doing path-sensitive symex then we do tail duplication, and there are no control-flow convergeances. Keep the guard as-was.
(c) if we're executing a multi-threaded program then symex_assume_l2 uses the guard to accumulate assumptions, which we must not discard.

In truth this optimisation is applicable whenever a block postdominates another, but function structure gives us a cheap way to establish postdominance without analysis (in the absence of setjmp/longjmp at least)

@smowton
Copy link
Contributor Author

smowton commented Aug 9, 2019

Poster-child example courtesy of @hannes-steffenhagen-diffblue:

#include <assert.h>
#include <string.h>
#include <stdlib.h>


char *nondet_str(void) {
  const char empty[] = "                               ";
  char *result = strdup(empty);
  int x;
  __CPROVER_assume(x >= 0 && x <= 9);
  result[30] = (char)('0'+x);
  return result;
}

int main(void) {
  long x = 2;
  const char *sx = "                              2";
  for(int i = 0; i < 10; ++i) {
    assert(strtol(nondet_str(), NULL, 10) == x);
  }
  return 0;
}

This previously got bogged down in an ever-growing guard, I think because the guard |= operator didn't quite manage to squash the guard back down so that it left strtol in the same state as it entered, and so the run slowed ever down and down. With this patch that example takes around 5 seconds to complete on my machine.

@smowton smowton force-pushed the smowton/feature/revert-guards-after-calls branch from e7525ef to a1a7b6c Compare August 9, 2019 16:56
@kroening
Copy link
Member

Guards getting fat & monstrous has bugged me for many years -- and I believe it's a major performance problem.
I agree it's probably not sensible to try to solve this exactly, but that cheap, syntactic stuff is the way to go.
Other ideas like "function body"?

@kroening
Copy link
Member

Another BTW: I think that path-pased symex suffers from this just as much.

@smowton
Copy link
Contributor Author

smowton commented Aug 10, 2019

@kroening I don't follow "function body"?

@smowton
Copy link
Contributor Author

smowton commented Aug 10, 2019

SVCOMP/Java: 3 tests stop timing out (timeout = 30 seconds), overall CPU time down 5%

@smowton
Copy link
Contributor Author

smowton commented Aug 10, 2019

SVCOMP/C subsample: 3 tests stop timing out (timeout = 30 seconds), total CPU time down 2%

@smowton smowton force-pushed the smowton/feature/revert-guards-after-calls branch from a1a7b6c to 9c9c883 Compare August 10, 2019 20:58
@smowton
Copy link
Contributor Author

smowton commented Aug 12, 2019

CI failure: this harms performance for cases with many conditional ASSUME false; instructions in one function -- previously these would be ignored for guard purposes and the unviable instructions executed regardless, which was wasteful but lead to the guard being simplified more effectively. This is especially punitive with BDD guards.

Note this already happens now when paths are truncated by falsifying the guard -- for example exceeding depth bound, or the (specifically recursion) unwind bound.

Proposed course of action: merge this PR with the guard reversion but not the path truncation cleanup, then separately pursue guard shrinkage when paths are truncated, then make path truncation uniform so that we don't execute instructions after an ASSUME false.

@smowton smowton changed the title Symex: revert guards after call return [TG-9002] Symex: revert guards after call return Aug 12, 2019
@smowton
Copy link
Contributor Author

smowton commented Aug 12, 2019

In view of this I've removed the cleanup of ASSUME vs. guard manipulation from this PR and only make the on-function-return guard simplification. I'm re-running my benchmarks on this modified version and expect CI to pass here.

@smowton smowton force-pushed the smowton/feature/revert-guards-after-calls branch 2 times, most recently from 0911519 to ccf6391 Compare August 13, 2019 08:44
@smowton
Copy link
Contributor Author

smowton commented Aug 13, 2019

Current status: causes about 10 SVCOMP/Java tests to stop timing out and instead return incorrect answers. Looking into it...

@codecov-io
Copy link

codecov-io commented Aug 13, 2019

Codecov Report

Merging #5003 into develop will decrease coverage by 0.04%.
The diff coverage is 100%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #5003      +/-   ##
===========================================
- Coverage    69.47%   69.43%   -0.05%     
===========================================
  Files         1310     1310              
  Lines       108763   108741      -22     
===========================================
- Hits         75566    75502      -64     
- Misses       33197    33239      +42
Impacted Files Coverage Δ
src/goto-symex/symex_main.cpp 86.83% <100%> (+0.04%) ⬆️
src/goto-symex/frame.h 100% <100%> (ø) ⬆️
src/goto-symex/symex_goto.cpp 94.91% <100%> (+0.02%) ⬆️
src/goto-symex/symex_function_call.cpp 86.8% <100%> (+0.18%) ⬆️
src/goto-symex/call_stack.h 100% <100%> (ø) ⬆️
src/goto-symex/goto_symex_state.cpp 91.72% <100%> (ø) ⬆️
src/solvers/prop/bdd_expr.cpp 60.97% <0%> (-29.57%) ⬇️
src/solvers/bdd/bdd_miniBDD.h 84.61% <0%> (-15.39%) ⬇️
src/solvers/bdd/miniBDD/miniBDD.inc 93.47% <0%> (-6.53%) ⬇️
src/solvers/strings/string_builtin_function.h 75.26% <0%> (-2.16%) ⬇️
... and 20 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bb6baff...66133a6. Read the comment docs.

@smowton
Copy link
Contributor Author

smowton commented Aug 13, 2019

Conclusion: reintroduced those parts of the cleanup that ensure an ASSUME / real constraint is generated wherever we truncate execution due to the unwind or depth limits, as otherwise using the dominant guard would actually mean broadening the guard by re-including impossible paths. This PR now (a) ensures we generate a real constraint whenever we truncate a path, and (b) uses guard-as-of-entering-function whenever we exit that function, possibly broadening it by including paths that have been ruled out and/or simplifying it better than the expression-simplifier is capable of.

Forthcoming work will separate the concepts of guards and reachability, such that we can reduce the execution guard at control-flow convergances even when one of the predecessors is unreachable, while maintaining the benefit of not generating constraints for unreachable code.

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 9aec581).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/122999685

@smowton
Copy link
Contributor Author

smowton commented Aug 14, 2019

Re-benchmarked with neutral results: all I claim now is that specific examples where the guard gets out of hand, like Hannes' example above, become analysable with this change but were previously impractically slow.

main::1::i!0@1#[0-9]+ = 2
--
This checks that code rendered unreachable by a __CPROVER_assume(0) does not generate constraints.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

redundant blank line

Executing a function may have a cumulative effect on the state guard. For example, if the
callee contained ASSUME statements that rendered one or more control-flow options unviable
then the guard might still embody that restriction (i.e. for if(x) ASSUME(false) the guard might
still be `!x`). However, on function return we know that all control-flow paths have converged
or been shown to be unviable, so we can restore the simpler guard as it was when we entered
the callee function.

Exceptions:
(a) if the guard is false it would be correct but inefficient to restore it; keep it false until
    we find a convergeance with another viable path
(b) if we're doing path-sensitive symex then we do tail duplication, and there are no control-flow
    convergeances. Keep the guard as-was.
(c) if we're executing a multi-threaded program then symex_assume_l2 uses the guard to accumulate
    assumptions, which we must not discard.

In truth this optimisation is applicable whenever a block postdominates another, but function
structure gives us a cheap way to establish postdominance without analysis (in the absence of
setjmp/longjmp at least)
Whenever we truncate a particular execution path, whether due to --unwind or --depth settings,
we should generate an assumption so that it is not necessary to record the rejected path in the
state guard.

For example, before if we exceeded the depth limit then we would set the guard to false, and rely
on carrying this missing path forwards to any future assume or assert statement. By emitting an
assume we explicitly make a constraint ruling this path out, meaning that downstream guards no
longer need to carry that information (for example, where before we might have had guard `x` and
VCCs `!x => y` and `!x => z`, we can now emit a constraint `!x`, restore the guard to `true` and
produce VCCs `y` and `z`).

Note this commit does not actually change when the guard is set false and when it is not, as this
will have additional performance consequences and should be benchmarked independently.
@smowton smowton force-pushed the smowton/feature/revert-guards-after-calls branch from 91636b4 to 66133a6 Compare August 14, 2019 11:03
Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️
This PR failed Diffblue compatibility checks (cbmc commit: 5dbbf29).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/123209903
Status will be re-evaluated on next push.
Common spurious failures include: the cbmc commit has disappeared in the mean time (e.g. in a force-push); the author is not in the list of contributors (e.g. first-time contributors); compatibility was already broken by an earlier merge.

Copy link
Contributor

@allredj allredj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✔️
Passed Diffblue compatibility checks (cbmc commit: 66133a6).
Build URL: https://travis-ci.com/diffblue/test-gen/builds/123215484

@kroening kroening self-assigned this Aug 14, 2019
@kroening kroening merged commit 60b2851 into diffblue:develop Aug 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants